MQJoin: Efficient Shared Execution of Main-Memory Joins

Authors

  • Darko Makreshanski
  • Georgios Giannikis
  • Gustavo Alonso
  • Donald Kossmann
Abstract

Database architectures typically process queries one at a time, executing concurrent queries in independent execution contexts. Often, such a design leads to unpredictable performance and poor scalability. One approach to circumvent the problem is to take advantage of sharing opportunities across concurrently running queries. In this paper we propose Many-Query Join (MQJoin), a novel method for sharing the execution of a join that can efficiently deal with hundreds of concurrent queries. This is achieved by minimizing redundant work and making efficient use of main-memory bandwidth and multi-core architectures. Compared to existing proposals, MQJoin is able to efficiently handle larger workloads regardless of the schema by exploiting more sharing opportunities. We also compared MQJoin to two commercial main-memory column-store databases. For a TPC-H based workload, we show that MQJoin provides 2-5x higher throughput with significantly more stable response times.
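To make the general idea of sharing a join across concurrent queries concrete, here is a minimal Python sketch. It is not MQJoin's algorithm, which the abstract does not detail. It assumes a common shared-execution scheme in which each tuple carries a bitmask of the queries that select it, a single hash table serves all queries, and a joined pair is emitted for every query whose bit is set on both sides. The function name `shared_hash_join` and its parameters are hypothetical.

```python
from collections import defaultdict


def shared_hash_join(build_rows, probe_rows, build_preds, probe_preds):
    """Join two relations once on behalf of many queries (illustrative only).

    build_rows / probe_rows: iterables of (key, payload) tuples.
    build_preds / probe_preds: one predicate per query, applied to the payload.
    Returns {query_id: [(key, build_payload, probe_payload), ...]}.
    """
    assert len(build_preds) == len(probe_preds)

    def query_mask(payload, preds):
        # Bit q is set when query q's predicate accepts this tuple.
        mask = 0
        for q, pred in enumerate(preds):
            if pred(payload):
                mask |= 1 << q
        return mask

    # Build phase: a single hash table shared by all queries.
    table = defaultdict(list)
    for key, payload in build_rows:
        mask = query_mask(payload, build_preds)
        if mask:  # skip tuples that no query selects
            table[key].append((payload, mask))

    # Probe phase: each probe tuple is looked up once for all queries.
    results = defaultdict(list)
    for key, payload in probe_rows:
        probe_mask = query_mask(payload, probe_preds)
        if not probe_mask:
            continue
        for build_payload, build_mask in table.get(key, ()):
            both = build_mask & probe_mask  # queries selecting both sides
            q = 0
            while both:
                if both & 1:
                    results[q].append((key, build_payload, payload))
                both >>= 1
                q += 1
    return dict(results)
```

In this sketch the hash table is built and probed once regardless of how many queries are registered, so the per-query cost reduces to predicate evaluation and bitmask operations. That is one way to read "minimizing redundant work", under the assumptions stated above.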

Similar resources

Fast similarity join for multi-dimensional data

To appear in Information Systems Journal, Elsevier, 2005. The efficient processing of multidimensional similarity joins is important for a large class of applications. The dimensionality of the data for these applications ranges from low to high. Most existing methods have focused on the execution of high-dimensional joins over large amounts of disk-based data. The increasing sizes of main memor...

Execution replay for an MPI-based multi-threaded runtime system

In this paper we present an execution replay system for Athapascan, an MPI-based multi-threaded runtime system. The main challenge of this work was to deal with nondeterministic features of MPI promiscuous communications and varying number of test functions without compromising the efficiency of an existing solution for execution replay of shared memory thread based programs. Novel solutions we...

Partitioning Inverted Lists for Efficient Evaluation of Set-Containment Joins in Main Memory

We present an algorithm for efficient processing of set-containment joins in main memory. Our algorithm uses an index structure based on inverted files. We focus on improving performance of the algorithm in a main-memory environment by utilizing the L2 CPU cache more efficiently. To achieve this, we employ some optimizations including partitioning the inverted lists and compressing the intermed...

GPU processing of theta-joins

The GPGPU paradigm has been recently employed to accelerate the processing of big amounts of data through the utilization of the massive parallelism offered by modern GPUs. To date, several techniques have been proposed for the implementation of simple select, aggregate and equality join operations on GPUs. In this paper, we study the efficient implementation of theta-join queries between two r...

Memory Efficient Processing of DNA Sequences in Relational Main-Memory Database Systems

Pipeline breaking operators such as aggregations or joins require database systems to materialize intermediate results. In case that the database system exceeds main memory capacities due to large intermediate results, main-memory database systems experience a massive performance degradation due to paging or even abort queries. In our current research on efficiently analyzing DNA sequencing dat...

Journal title:
  • PVLDB

Volume 9  Issue

Pages  -

Publication date 2016